spectral component
- Asia > China (0.05)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
712a67567ec10c52c2b966224cf94d1e-Supplemental.pdf
This section presents a more detailed description and results from our three sets of experiments: ground truth recovery, synthetic data, and realistic time series. Instead of making full use of the data, we consider only the first 100 points as training data, followed by testing with the subsequent 30 points. The likelihood surfaces adjacent to any of the dashed lines are mirror images of each other. The corresponding joint posterior distribution of the 22 hyperparameters is displayed in Fig. 5.
Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning
Before sufficient training data is available, fine-tuning neural networks pre-trained on large-scale datasets substantially outperforms training from random initialization. However, fine-tuning methods suffer from two dilemmas, catastrophic forgetting and negative transfer. While several methods with explicit attempts to overcome catastrophic forgetting have been proposed, negative transfer is rarely delved into. In this paper, we launch an in-depth empirical investigation into negative transfer in fine-tuning and find that, for the weight parameters and feature representations, transferability of their spectral components is diverse. For safe transfer learning, we present Batch Spectral Shrinkage (BSS), a novel regularization approach to penalizing smaller singular values so that untransferable spectral components are suppressed. BSS is orthogonal to existing fine-tuning methods and is readily pluggable to them. Experimental results show that BSS can significantly enhance the performance of representative methods, especially with limited training data.
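As a rough illustration of the idea in the abstract above, the sketch below penalizes the smallest singular values of a batch feature matrix. The abstract only says that smaller singular values are penalized; the squared-sum form, the function name `bss_penalty`, and the parameter `k` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def bss_penalty(features: np.ndarray, k: int = 1) -> float:
    """Sum of squares of the k smallest singular values of a feature matrix.

    `features` is assumed to have shape (batch_size, feature_dim).
    The squared-sum form and the choice of k are illustrative assumptions.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(features, compute_uv=False)
    k = min(k, singular_values.size)
    return float(np.sum(singular_values[-k:] ** 2))


# Hypothetical batch of 8 feature vectors with 4 dimensions each.
rng = np.random.default_rng(0)
batch_features = rng.normal(size=(8, 4))
penalty = bss_penalty(batch_features, k=1)
```

In a fine-tuning loop, a term like this would presumably be added to the task loss with a small weight, so that gradient descent shrinks the spectral components associated with the smallest singular values.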
Deeply Learned Spectral Total Variation Decomposition
Non-linear spectral decompositions of images based on one-homogeneous functionals such as total variation have gained considerable attention in the last few years. Due to their ability to extract spectral components corresponding to objects of different size and contrast, such decompositions enable filtering, feature transfer, image fusion and other applications. However, obtaining this decomposition involves solving multiple non-smooth optimisation problems and is therefore computationally highly intensive. In this paper, we present a neural network approximation of a non-linear spectral decomposition. We report up to four orders of magnitude (×10,000) speedup in processing of mega-pixel size images, compared to classical GPU implementations. Our proposed network, TVspecNET, is able to implicitly learn the underlying PDE and, despite being entirely data driven, inherits invariances of the model-based transform. To the best of our knowledge, this is the first approach towards learning a non-linear spectral decomposition of images. Not only do we gain a staggering computational advantage, but this approach can also be seen as a step towards studying neural networks that can decompose an image into spectral components defined by a user rather than a handcrafted functional.
How Muon's Spectral Design Benefits Generalization: A Study on Imbalanced Data
Vasudeva, Bhavya, Deora, Puneesh, Zhao, Yize, Sharan, Vatsal, Thrampoulidis, Christos
The growing adoption of spectrum-aware matrix-valued optimizers such as Muon and Shampoo in deep learning motivates a systematic study of their generalization properties and, in particular, when they might outperform competitive algorithms. We approach this question by introducing appropriate simplifying abstractions as follows: First, we use imbalanced data as a testbed. Second, we study the canonical form of such optimizers, which is Spectral Gradient Descent (SpecGD) -- each update step is $UV^T$ where $UΣV^T$ is the truncated SVD of the gradient. Third, within this framework we identify a canonical setting for which we precisely quantify when SpecGD outperforms vanilla Euclidean GD. For a Gaussian mixture data model and both linear and bilinear models, we show that unlike GD, which prioritizes learning dominant principal components of the data first, SpecGD learns all principal components of the data at equal rates. We demonstrate how this translates to a growing gap in balanced accuracy favoring SpecGD early in training and further show that the gap remains consistent even when the GD counterpart uses adaptive step-sizes via normalization. By extending the analysis to deep linear models, we show that depth amplifies these effects. We empirically verify our theoretical findings on a variety of imbalanced datasets. Our experiments compare practical variants of spectral methods, like Muon and Shampoo, against their Euclidean counterparts and Adam. The results validate our findings that these spectral optimizers achieve superior generalization by promoting a more balanced learning of the data's underlying components.
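The SpecGD update described in the abstract above (each step is $UV^T$, where $UΣV^T$ is the truncated SVD of the gradient) can be sketched in a few lines of NumPy. The function name `specgd_step`, the learning rate, and the rank-selection tolerance are assumptions for illustration; the abstract does not specify how the truncation rank is chosen.

```python
import numpy as np


def specgd_step(weights, grad, lr=0.1, rank=None):
    """One spectral gradient descent step: W <- W - lr * U V^T.

    U and V come from the SVD of the gradient; the singular values
    themselves are discarded. If `rank` is None, directions whose
    singular value is numerically zero are dropped (an assumed rule).
    """
    U, S, Vt = np.linalg.svd(grad, full_matrices=False)
    if rank is None:
        tol = S.max() * max(grad.shape) * np.finfo(S.dtype).eps
        rank = int(np.sum(S > tol))
    update = U[:, :rank] @ Vt[:rank, :]
    return weights - lr * update


# Toy gradient with one dominant and one weak direction.
grad = np.diag([5.0, 0.5])
w0 = np.zeros((2, 2))

w_spec = specgd_step(w0, grad, lr=0.1)  # moves equally along both directions
w_gd = w0 - 0.1 * grad                  # plain GD favors the dominant direction
```

In this toy case the spectral step changes both diagonal entries by the same amount, while the plain gradient step moves ten times further along the dominant direction, which mirrors the abstract's claim that SpecGD learns all principal components at equal rates.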
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Physics-Informed Spectral Modeling for Hyperspectral Imaging
Gawrysiak, Zuzanna, Krawiec, Krzysztof
PhISM is based on the autoencoder blueprint and involves two stages: (i) autoassociative self-supervised and task-agnostic training of the autoencoder, to form informative latent representations that enable possibly accurate reconstruction of the input image (Section 2.1), and (ii) task-specific training of a prediction module that maps that latent
- North America > United States (0.14)
- Europe > Poland > Greater Poland Province > Poznań (0.06)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Qinghai Province > Xining (0.04)
Harmonic fractal transformation for modeling complex neuronal effects: from bursting and noise shaping to waveform sensitivity and noise-induced subthreshold spiking
We propose the first fractal frequency mapping, which, in a simple form, makes it possible to replicate complex neuronal effects. Unlike conventional filters, which suppress or amplify the input spectral components according to the filter weights, the transformation excites novel components by a fractal recomposition of the input spectra, resulting in the formation of spikes at resonant frequencies that are optimal for sampling. This enables high-sensitivity detection, robustness to noise, and noise-induced signal amplification. The proposed model illustrates that neuronal functionality can be viewed as a linear summation of the spectrum over a nonlinearly transformed frequency domain.
Reviews: Scalable Levy Process Priors for Spectral Kernel Learning
The paper proposes a spectral mixture of Laplacian kernels with a Lévy process prior on the spectral components. This extends the SM kernel by Wilson, which is a mixture of Gaussians with no prior on the spectral components. An RJ-MCMC sampler is proposed that can model the number of components and represent the spectral posterior. A large-scale approximation (SKI) is also implemented. The idea of a Lévy prior on the spectral components is a very interesting one, but the paper does not make clear what the benefits are with respect to kernel learning.
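For context, the baseline that the review refers to, a one-dimensional spectral mixture (SM) kernel whose spectral density is a mixture of Gaussians, can be sketched as below. This is the standard Gaussian SM form, not the Laplacian mixture or the Lévy process prior of the reviewed paper; the function and parameter names are chosen for this example only.

```python
import numpy as np


def spectral_mixture_kernel(tau, weights, means, variances):
    """1-D spectral mixture kernel with Gaussian spectral components.

    k(tau) = sum_q w_q * exp(-2 * pi^2 * tau^2 * v_q) * cos(2 * pi * tau * mu_q)

    tau:       input differences x - x' (scalar or array)
    weights:   mixture weights w_q
    means:     spectral means mu_q (frequencies)
    variances: spectral variances v_q
    """
    tau = np.asarray(tau, dtype=float)[..., None]  # broadcast over components
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    envelope = np.exp(-2.0 * np.pi**2 * tau**2 * variances)
    carrier = np.cos(2.0 * np.pi * tau * means)
    return np.sum(weights * envelope * carrier, axis=-1)


# Two hypothetical spectral components.
taus = np.linspace(-1.0, 1.0, 5)
k_vals = spectral_mixture_kernel(taus, [0.7, 0.3], [0.5, 2.0], [0.1, 0.2])
```

Each component contributes a periodic term at frequency `means[q]` with a decay rate set by `variances[q]`; a prior over these spectral components is what the review says the Gaussian SM kernel lacks.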